Instantons causing iterative decoding to cycle

Misha Stepanov 



Abstract — It is speculated that the most probable channel noise 
realizations (instantons) that cause the iterative decoding of
low-density parity-check codes to fail prevent the decoding from
converging. A simple example is given of an instanton that is not
a pseudo-codeword and causes iterative decoding to cycle. A
method of finding the instantons for a large number of iterations is
presented and tested on Tanner's [155,64,20] code and Gaussian 
channel. The inherently dynamic instanton with effective distance 
of 11.475333 is found. 

Index Terms — iterative decoding, LDPC codes, error floor. 



I. Introduction 

Low-density parity-check (LDPC) codes [1], [2] with
iterative decoding have attracted a lot of attention due to their
excellent performance. However, the decoding error probability is
larger than one could expect when the Signal-to-Noise Ratio (SNR) is
high, a phenomenon called the error floor [4], [5].

In some cases the substructures of the code that provide the
leading contribution to the error probability are known: they
are codewords in the case of maximum likelihood decoding,
and stopping sets [6] in the case of iterative decoding and the
binary erasure channel. For the general situation several heuristics
have been introduced: near-codewords [4] or trapping sets as
subsets of bits that violate just a few parity checks,
pseudo-codewords [3] as the codewords on the computational tree,
stopping sets, pseudo-codewords as non-codeword vertices of a
polytope used in linear programming decoding [7], [fully]
absorbing sets [8], and instantons [9]. Even if a description
of the deleterious substructures is available, it still could be a
non-trivial problem to find them.

LDPC codes can be defined by a parity-check matrix $H$ or by a
Tanner graph [ 1 1 which is a sparse bipartite graph with two 
sets of vertices: bits and parity checks. The notation /o-ooc is 
used to indicate that $H_{\alpha i} = 1$, i.e., that the bit $i$ and
the check $\alpha$ are connected by an edge.

The binary (made of $+1$ and $-1$ numbers, or just "+"s
and "-"s) codeword $\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_N)$ is
transmitted over a noisy channel with continuous output
$x = (x_1, x_2, \ldots, x_N)$. In the paper the channel is assumed to
be memoryless, i.e., $P(x|\sigma) = \prod_{i=1}^{N} P(x_i|\sigma_i)$.
The decoder takes the logarithmic likelihoods
$h_i = (1/2) \log (P(+|x_i)/P(-|x_i))$ at each bit $i$ as
an input, where $P(\pm|x) = P(x|\pm)/(P(x|+) + P(x|-))$.

The iterative decoding that is used in the paper is the min-sum
algorithm, with checking at each iteration whether the current output
$c = \operatorname{sign} m$ is a valid codeword, i.e.,
$H \cdot ((1 - c)/2) = 0 \pmod 2$ (and if it is, the iterations stop).
At the beginning of the decoding there are no messages to bits: at
step $-1/2$ they are all equal to 0.



Let us define the noise vector $\xi = (\xi_1, \xi_2, \ldots, \xi_N)$ by
$\xi_i = 1 - \sigma_i x_i$. For simplicity, the channel is assumed to be
symmetric, $P(-x|+) = P(x|-)$; then the decoding error probability
(and the set of error-causing noise configurations) is independent of
the codeword $\sigma$ being sent.

Consider the error correcting code, the transmission channel,
and the decoding algorithm (including the [maximal]
number of iterations) being fixed. The channel noise space
$\mathfrak{N}$ is then divided into two sets, $\mathfrak{N} \setminus \mathcal{E}$
and $\mathcal{E}$: the noise realizations that are decoded successfully
and the ones that result in a decoding error. The instantons are
defined as the positions of local maxima of the noise distribution
density $P(\xi) = \prod_{i=1}^{N} P(\xi_i)$ over the set of error-causing
noise configurations $\mathcal{E}$. In the limit of high SNR the probability
of a decoding error somewhere in the information block, the
Frame-Error Rate (FER), is controlled by the instanton with
maximal $P(\xi)$ and its vicinity. (In order to describe the FER vs.
SNR dependence in the moderate SNR region one may need
to collect the contribution from several instantons.) Such a
definition of the instanton is a paraphrase of "source of trouble",
and is practically useless without a method to locate it.

A noise configuration is said to withstand $n$ iterations
if after $n$ iterations the decoding output is still wrong (that
includes the case when the decoding output is a wrong
but valid codeword; the noise then withstands $\infty$ iterations).
Checking for the output being a codeword at each iteration
makes the set $\mathcal{E}$ a non-increasing function of the number
of iterations: $\mathcal{E}(n_{\mathrm{iter}} + 1) \subseteq \mathcal{E}(n_{\mathrm{iter}})$.

The min-sum decoding is the high SNR limit of
the sum-product algorithm. In addition, if
$P(1 - \xi \,|\, +) = \exp(-\beta(\mathrm{SNR}) \cdot F(\xi)) / Z(\mathrm{SNR})$
for some increasing function $\beta(\mathrm{SNR})$, then the decoding
input has the form $h(\xi) = \beta \cdot (F(2 - \xi) - F(\xi))/2$.
(This includes the Additive White Gaussian Noise (AWGN) channel
with $F(\xi) = \xi^2$ and $h = 2\beta(1 - \xi)$.)
As the min-sum decoding is scalable (i.e., the result of the
decoding stays the same if the decoding input vector $h$ is
multiplied by a positive number), the set $\mathcal{E}$ is independent
of SNR, and so are the instantons.
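As a quick numerical illustration of the formula for the decoding input (a sketch only: the function names and the value of $\beta$ below are chosen for illustration and do not come from the paper), the general expression $h(\xi) = \beta (F(2-\xi) - F(\xi))/2$ can be compared with the AWGN closed form $2\beta(1-\xi)$:

```python
def decoding_input(xi, beta, F):
    """General decoding input h(xi) = beta * (F(2 - xi) - F(xi)) / 2."""
    return beta * (F(2.0 - xi) - F(xi)) / 2.0


def awgn_F(xi):
    """AWGN case: F(xi) = xi^2."""
    return xi * xi


def awgn_input(xi, beta):
    """Closed form for the AWGN channel: h = 2 * beta * (1 - xi)."""
    return 2.0 * beta * (1.0 - xi)


beta = 1.7  # arbitrary illustrative value; only the ratio of inputs matters for min-sum
for xi in (-0.5, 0.0, 0.3, 1.0, 1.7):
    general = decoding_input(xi, beta, awgn_F)
    closed = awgn_input(xi, beta)
    print(f"xi={xi:5.2f}  general={general:8.4f}  closed form={closed:8.4f}")
```

Since $\beta$ enters only as an overall positive factor of $h$, changing it rescales all inputs at once, which is exactly the situation in which a scalable decoder returns the same result.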

II. Cycling of iterations 

One could imagine several possibilities how the iterative 
decoder could fail: 

R: The iterative decoding was converging to the right solution,
but it did not succeed within the allowed number of
iterations.

W: The iterative decoding converged but to a wrong place. 
After the convergence the decoding output is a codeword, 
just not the one that was sent. 
P: The iterative decoding converged but to a wrong place. 
After the convergence the decoding output is not even a 
codeword. 

C: The iterative decoding is not going to converge no matter
how many iterations one can afford.

The situation R can be corrected by adding more iterations.
It is highly probable that in situation W even maximum
likelihood decoding would make an error, and the probability
of such a situation [in the presence of an error floor] is very small;
thus an error due to possibilities P or C is much more
probable.

Following the so-called Bethe free energy variational approach
[11], belief propagation can be understood as a set
of equations for beliefs solving a constrained minimization
problem. On the other hand, a more traditional approach is to
interpret belief propagation in terms of an iterative procedure,
the so-called belief propagation iterative algorithm [1], [12],
[13]. Being identical on a tree (as then the belief propagation
equations are solved explicitly by iterations from the leaves to
the tree center), the two approaches are however distinct for a
graphical problem with loops. In case of their convergence,
belief propagation algorithms find a minimum of the Bethe free
energy [11], [14], [15]; however, in the general case convergence
of the standard iterative belief propagation is not guaranteed.

Experiments with Tanner's [155,64,20] code [16]
showed the following. The instanton for linear programming
decoding [7], which minimizes a certain part of the Bethe
free energy and is not iterative in nature, for the AWGN channel
has an effective distance close to 16.4 [17], [18]. At the
same time a noise configuration $\xi$ with effective distance
or weight $w(\xi) = \sum_{i=1}^{N} F(\xi_i) = \|\xi\|_2^2 \approx 12.45$ which withstands
410 iterations was found [19]. There is a strong indication that
in the close vicinity of this noise configuration there are ones
that withstand an arbitrarily large number of iterations.

If the decoder produces errors mostly due to situation P,
then it converges on most occasions. The fixed point of iterative
decoding is a minimum of the Bethe free energy. Thus, the
iterative decoder should perform no worse than the linear programming
decoder, as the latter neglects a certain part of the Bethe free
energy. This contradicts what was observed experimentally
for Tanner's [155,64,20] code: 12.45 < 16.4.

In contrast with decoding algorithms that are static
(e.g., linear programming decoding), in the case of iterative
decoding the instantons could be inherently dynamic, and in
order to find them the dynamics of iterations [in full detail]
must necessarily be considered.
Fig. 1. Decoding dynamics on the instanton $\xi = (10,6,4,4)/7$. The vector
$h$ is proportional to $(-3,1,3,3)$, and (as the decoding is scalable) the latter
was used as $h$ to form the table. The general formula at the end starts to be
applicable from $n > 1$, while at iteration $k = 2$ it is not valid yet.



As an example of cycling of iterations, consider a simple
code with 4 bits and 5 parity checks:

$$H = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix}$$

The parity checks are obviously redundant, and the code has 2 
codewords: (+,+,+,+) and (—,—,—,—). As the first 4 parity 
checks have connectivity 2, the only pseudo-codewords are the 
codewords. Because of the checks with connectivity 2 all the 
bits (even on a graph-cover) should have the same values. 

The lowest instanton that survives an infinite number of iterations
is $\xi = (10,6,4,4)/7$ with the weight
$w(\xi) = (10^2 + 6^2 + 4^2 + 4^2)/7^2 = 168/7^2 = 24/7 < 4$
(the numeration of bits goes
along the 8-cycle containing the checks with connectivity 2).
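The arithmetic for this instanton is easy to check with exact rational numbers. The short sketch below (variable names are illustrative) recomputes the weight for the AWGN case, $w(\xi) = \sum_i \xi_i^2$, and the direction of the decoding input, $h \propto 1 - \xi$:

```python
from fractions import Fraction

# The instanton of the 4-bit example, xi = (10, 6, 4, 4) / 7.
xi = [Fraction(10, 7), Fraction(6, 7), Fraction(4, 7), Fraction(4, 7)]

# AWGN weight: sum of squared components.
w = sum(v * v for v in xi)

# The decoding input is proportional to 1 - xi; multiplying by 7 clears denominators.
h_direction = [7 * (1 - v) for v in xi]

print("w(xi) =", w)                                # 24/7
print("h is proportional to", [int(v) for v in h_direction])  # [-3, 1, 3, 3]
```

The weight $24/7$ lies below 4, the weight of the noise vector $(1,1,1,1)$ at which the channel output is completely undecided.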

The cycling dynamics of iterations is shown in Fig. 1.
The decoding output (and the messages bits ↔ checks) is not
exactly periodic with the iteration number. If one considers
one iteration of the decoder as a mapping in the space of
messages, then the instantons are not necessarily periodic
orbits (i.e., exact cycles) of the mapping.
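The first iterations on this example can be reproduced with a small min-sum decoder. The code below is a minimal sketch under stated assumptions, not the implementation used for the paper: it uses the standard flooding schedule (all bit-to-check messages, then all check-to-bit messages), treats a zero message as positive when taking signs, and regards an output with an undecided component ($m_i = 0$) as not being a codeword. The parity-check matrix is the $5 \times 4$ matrix of the 4-bit example, and the input is the rescaled vector $(-3,1,3,3)$.

```python
# Parity-check matrix of the 4-bit example (rows are checks, columns are bits).
H = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]


def sign(v):
    """Sign convention assumed here: zero is treated as positive."""
    return -1 if v < 0 else 1


def is_codeword(H, m):
    """True if sign(m) has no undecided bits and satisfies every parity check."""
    if any(v == 0 for v in m):
        return False
    bits = [(1 - sign(v)) // 2 for v in m]  # '+' -> 0, '-' -> 1
    return all(sum(b for b, hij in zip(bits, row) if hij) % 2 == 0 for row in H)


def min_sum_outputs(H, h, n_iter):
    """Return [m(0), m(1), ...], stopping early once sign(m) is a codeword."""
    edges = [(a, i) for a, row in enumerate(H) for i, hij in enumerate(row) if hij]
    mu = {e: 0 for e in edges}  # check-to-bit messages, all zero at the start
    outputs = []
    for _ in range(n_iter + 1):
        # Decoder output: channel input plus all incoming check-to-bit messages.
        m = list(h)
        for (a, i) in edges:
            m[i] += mu[(a, i)]
        outputs.append(m)
        if is_codeword(H, m):
            break
        # Bit-to-check message: the output minus the message received on that edge.
        eta = {(a, i): m[i] - mu[(a, i)] for (a, i) in edges}
        new_mu = {}
        for (a, i) in edges:
            others = [eta[(b, j)] for (b, j) in edges if b == a and j != i]
            s = 1
            for v in others:
                s *= sign(v)
            new_mu[(a, i)] = s * min(abs(v) for v in others)
        mu = new_mu
    return outputs


outputs = min_sum_outputs(H, [-3, 1, 3, 3], 2)
for n, m in enumerate(outputs):
    print(n, m)
```

With this schedule the first three outputs are $(-3,1,3,3)$, $(2,-2,6,2)$ and $(4,8,-2,6)$: none of them is a codeword, and the single negative component moves from bit 1 to bit 2 to bit 3. Passing a larger last argument to `min_sum_outputs` gives a longer history to inspect.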

III. Instantons array 

The instanton-amoeba scheme [9], [19], while being quite
effective in getting instantons for a small number of iterations
$n_{\mathrm{iter}}$ (with about 10 iterations being the maximum in practice),
has difficulties in finding the instantons for large $n_{\mathrm{iter}}$.
The problem is the rough landscape of the function that
amoeba tries to optimize. The moves that amoeba makes assume
that the landscape is regular (see [20], [21]). The problem
with the application of the downhill simplex/amoeba method to
finding the $n_{\mathrm{iter}}$-iteration instantons is that amoeba always aims
for noise configurations that withstand $n_{\mathrm{iter}}$ (i.e., many)
iterations. The set $\mathcal{E}(n_{\mathrm{iter}})$ of such noise configurations
[for large $n_{\mathrm{iter}}$] is very irregular near its boundary (see Fig. 2),
and amoeba gets confused and uncontrollably reduces its size without
any progress.

Fig. 2. Two-dimensional cuts of the [155-dimensional] noise space that
contain the zero noise vector and the lowest weight instanton $\xi_a$ for Tanner's
[155,64,20] code and the AWGN channel. The line going through $0$ and $\xi_a$ is
horizontal. The plane of the cut is determined by the 3rd point it goes through.
In panels (a), (b), and (c) it is the instanton $\xi_b$; in panel (d) it is a random
vector; and in panel (e) it is a vector $t$ that has five components equal to 1,
including $t_{131} = t_{136} = 1$, and all other components equal to 0. The labels
0, 1, 2, 3, and 4 indicate how many iterations the noise withstands in this area.
The tone of gray is calculated as $(9 - \log_2 n)/11$, where $n$ is how many
iterations the noise configuration withstands, with 0/1 being black/white.
Tones 10/11 and 1 correspond to $n = 0$ and to correct decoding without any
iterations (i.e., $\xi_i < 1$ for all $i$).

The algorithm shown in Fig. 3 and described below overcomes
this difficulty and is able to find instantons for large
$n_{\mathrm{iter}}$. The procedure deals with the array of noise configurations
$\xi(k)$, $k = 0, 1, \ldots, n_{\mathrm{iter;max}}$, where at any time the noise
$\xi(k)$ is the one with the largest $P(\xi)$ (or the lowest weight
$w(\xi) = \sum_{i=1}^{N} F(\xi_i)$) among all the noise configurations
withstanding $k$ iterations that were encountered in the procedure so far.
(The updates of $\xi(k)$ are done in the line L5 and (at the start)
in the line L2.)

L1  start with the noise vector $\xi = (1, 1, \ldots, 1)$
L2  check some (maybe empty) list of noise vectors
L3  for $k = 0, 1, \ldots, n_{\mathrm{iter;max}}$
L4      perturb $\xi(k)$
L5      check perturbed noise vector
L6  go to L3 or exit

Fig. 3. Iterative decoding instanton search algorithm.

Fig. 4. The two lowest instantons $\xi_a$ and $\xi_b$. The tone of gray is calculated
as $1 - \xi_i/2$, with $\xi_i = 2$ / $\xi_i = 0$ being black/white. The 155 x 93 parity check
matrix $H$ consists of three blocks: $(R^{1}\, R^{2}\, R^{4}\, R^{8}\, R^{16})$,
$(R^{5}\, R^{10}\, R^{20}\, R^{9}\, R^{18})$, $(R^{25}\, R^{19}\, R^{7}\, R^{14}\, R^{28})$,
where $R$ is the 31 x 31 matrix that cyclically shifts a
column vector up by one component.

Fig. 5. Iterative decoding output $m$ on the instantons $\xi_a$ and $\xi_b$; 200 iterations
running from top to bottom are shown. The tone of gray is calculated as
$(1 + m/10)/2$, with 0/1 being black/white. Middle gray (tone 1/2) corresponds
to undecided output $m = 0$. The decoding input $h = m^{(0)}$ is shown at the top
for comparison of input and output magnitudes.

In the line L1 of the algorithm the output of the channel
is completely undecided ($h = (0, 0, \ldots, 0)$). This configuration
obviously withstands $\infty$ iterations, although $P(\xi)$ at it is quite
low. This step makes $\xi(k) = (1, 1, \ldots, 1)$ for all
$k = 0, 1, \ldots, n_{\mathrm{iter;max}}$.

In the line L2 the noise configurations that are known from 
some external source (e.g., from previous runs of the procedure 
or from the analysis of trapping sets or pseudo-codewords) 
may be introduced as a starting point for instanton search. 
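The bookkeeping of the array $\xi(k)$ in the lines L1 to L6 can be sketched as follows. This is a minimal sketch rather than the code used for the experiments. The names are illustrative assumptions: `withstands` is a user-supplied callable returning how many iterations a noise vector withstands ($-1$ if it is decoded correctly without any iterations, `math.inf` if it withstands any number), `simple_perturb` is a placeholder perturbation (plain Gaussian jitter of fixed amplitude), and the weight is taken as $\|\xi\|_2^2$, as for the AWGN channel.

```python
import math
import random


def weight(xi):
    """AWGN weight w(xi): the sum of squared components."""
    return sum(v * v for v in xi)


def simple_perturb(xi, rng, amplitude=0.05):
    """Placeholder perturbation (an assumption): add small Gaussian jitter."""
    return [v + amplitude * rng.gauss(0.0, 1.0) for v in xi]


def update_array(best, candidate, n_withstood):
    """L5: store the candidate in every slot k it qualifies for and improves."""
    top = min(n_withstood, len(best) - 1)
    w = weight(candidate)
    for k in range(top + 1):
        if w < weight(best[k]):
            best[k] = candidate


def instanton_search(withstands, n_bits, n_iter_max, sweeps, known=(), seed=0):
    """Return the array xi(0), ..., xi(n_iter_max) after a number of sweeps."""
    rng = random.Random(seed)
    # L1: the all-ones noise (h = 0) is taken to withstand any number of iterations.
    best = [[1.0] * n_bits for _ in range(n_iter_max + 1)]
    # L2: check an optional list of externally supplied noise vectors.
    for xi in known:
        update_array(best, list(xi), withstands(xi))
    for _ in range(sweeps):                              # L6: repeat or exit
        for k in range(n_iter_max + 1):                  # L3
            candidate = simple_perturb(best[k], rng)     # L4
            update_array(best, candidate, withstands(candidate))  # L5
    return best
```

Because a candidate is stored only in the slots $k$ it withstands and only when it lowers the weight there, each $\xi(k)$ always withstands at least $k$ iterations and its weight never increases. Progress made in the low-$k$ slots is kept and can later be converted into progress at larger $k$ when a perturbation of such a vector happens to withstand more iterations.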

This procedure, applied to Tanner's [155,64,20] code and the
AWGN channel with $n_{\mathrm{iter;max}} = 100$, produced an instanton
$\xi_a$ with the lowest weight $w(\xi_a) = \|\xi_a\|_2^2 \approx 11.475333$ that
causes iterations to cycle with a period of length 12 (see
Fig. 5). The next instanton $\xi_b$ has the weight $w(\xi_b) \approx 11.4996$.
The differences in weight between configurations that withstand 20
or more iterations are very small. Submitting the array $\xi(k)$ as
an initial state of the procedure with a larger $n_{\mathrm{iter;max}}$ relatively
quickly produces noise configurations of very close weight
that withstand the larger number $n_{\mathrm{iter;max}}$ of iterations.
Fig. 6. The effective weight $w$ of the noise configuration withstanding
$n_{\mathrm{iter;max}} = 100$ iterations for Tanner's [155,64,20] code and the AWGN
channel vs. CPU (Intel Xeon X3360, 2.83 GHz) time; 500 realizations are shown.
The feedback for the amplitude $a$ is A (upper panel) and D (lower panel).
Fig. 7. The probability/frequency of occurrence, $p(w)$, of noise configurations
withstanding $n_{\mathrm{iter;max}} = 100$ iterations with the weight $w$ or smaller
after 1, 2, 3, 4 (dashed curves) and 5, 10, 30 (solid curves) minutes of
CPU time. Code, channel, feedback, and CPU are the same as in Fig. 6.



Below are the details of the procedure used to generate
Figs. 6 and 7.¹ The noise vector $\xi$ is perturbed as $\xi \to c\xi + a\psi$,
where the components of $\psi$ are independent standard normal
random variables. The coefficient $c = \sqrt{1 - a^2 N / w(\xi)} < 1$
ensures that the expected value
$E\|c\xi + a\psi\|_2^2 = c^2 w(\xi) + a^2 N = w(\xi)$
is not systematically increased by the addition of $a\psi$.

One does not want the amplitude of the perturbation
$a$ to be too small (then the optimization is slow) or too large
(then the perturbed noise is rejected often). To accelerate the
procedure the amplitude $a$ is chosen according to the following
negative feedback: each noise configuration $\xi$ has a number $A$
attached to it, and the perturbed noise $c\xi + a\psi$ gets the number
$2A$ attached, while the number attached to $\xi$ is decreased by
a factor 0.999. In the line L1 the value $A = 0.1$ is attached.
The amplitude of the perturbation $a$ is chosen as A: $a = A$;
D: $0.1A < a < A$; and W: $10^{-14} < a < 0.1$, with uniform
distribution of $\log a$ in both D and W. In comparison to A and
D, the progress in W is slow: the perturbation amplitude $a$
is often too small or too large.

¹ The perturbation of the noise vector in the line L4 (including the choice of
the perturbation amplitude), of course, can be done in many different ways.
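The perturbation step and the three ways of choosing its amplitude can be sketched as below. This is an illustrative sketch under stated assumptions, not the original code: the function names are hypothetical, the weight is the AWGN one, $w(\xi) = \|\xi\|_2^2$, the lower bound of the W range is read as $10^{-14}$, and a perturbation whose amplitude is too large for the rescaling ($a^2 N \ge w(\xi)$) is simply rejected.

```python
import math
import random


def rescaled_perturbation(xi, a, rng):
    """Return c*xi + a*psi with c = sqrt(1 - a^2 N / w(xi)), or None if a is too large."""
    n = len(xi)
    w = sum(v * v for v in xi)
    c_squared = 1.0 - a * a * n / w
    if c_squared <= 0.0:
        return None  # assumption: treat an oversized amplitude as a rejected move
    c = math.sqrt(c_squared)
    return [c * v + a * rng.gauss(0.0, 1.0) for v in xi]


def choose_amplitude(feedback, A, rng):
    """Amplitude a for feedback 'A' (a = A), 'D' (0.1A..A) or 'W' (1e-14..0.1)."""
    if feedback == "A":
        return A
    if feedback == "D":
        low, high = 0.1 * A, A
    elif feedback == "W":
        low, high = 1e-14, 0.1
    else:
        raise ValueError("feedback must be 'A', 'D' or 'W'")
    # log(a) is uniformly distributed between log(low) and log(high)
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def perturb_with_feedback(xi, A, feedback, rng):
    """Return (perturbed vector, number attached to it, updated number for xi)."""
    a = choose_amplitude(feedback, A, rng)
    return rescaled_perturbation(xi, a, rng), 2.0 * A, 0.999 * A


rng = random.Random(0)
xi = [1.0] * 8  # starting vector of line L1 for a toy block length of 8
perturbed, A_child, A_parent = perturb_with_feedback(xi, 0.1, "D", rng)
print(sum(v * v for v in perturbed), A_child, A_parent)
```

Attaching $2A$ to the perturbed vector and shrinking the parent's number by 0.999 means that a vector which keeps producing rejected perturbations is perturbed more and more gently, while a freshly stored vector starts with a larger step.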

How $w(\xi(100))$ goes down with time is shown in Fig. 6. It
can be seen that sometimes $w(\xi(100))$ suddenly drops by
quite a bit. This happens when the beginning part of the
array (but not $\xi(100)$) has already gone lower in weight, and then
a small perturbation suddenly withstands 100 iterations, so
$\xi(100)$ is updated. Such events are what makes the whole
procedure work. The progress in the beginning part of the
array is a lot more regular, and the procedure treasures it in
the hope that it will be converted into progress at $n_{\mathrm{iter;max}}$.

The distribution of $w(\xi(100))$ is shown in Fig. 7. As can
be seen, the fate of the run is determined quite early.

At the very beginning there are few rejections of the
perturbed noise vectors, and with the feedback A [with its larger
choices of $a$] the weight goes down faster than with D (see
Fig. 6). When the weight reaches about 20, the feedback D
is more effective, probably because the occasional choices of $a$
smaller than $A$ lead to perturbations being rejected less
often, which keeps the values of $A$ large enough. Eventually
A is more effective (see Fig. 7), although such a difference
between A and D is a bit surprising.

IV. Discussion 

In the vicinity of the instantons $\xi_a$ and $\xi_b$ there are noise
configurations that withstand just 3 iterations (see Fig. 2). Is
it possible to locate these instantons from the analysis of the
decoding with a small number of iterations?

The lowest instanton weight $w$ describes how fast the
decoding error probability goes down with SNR in the high
SNR limit: $\log P(\mathcal{E}) \approx -\beta(\mathrm{SNR}) \cdot w$. The value of $w$ quickly
saturates with the number of iterations $n_{\mathrm{iter}}$, and the decrease
of $P(\mathcal{E})$ with $n_{\mathrm{iter}}$ is probably caused by the thinning of the
set $\mathcal{E}(n_{\mathrm{iter}})$ in the vicinity of the instanton. How exactly does
this happen?

In the example from Sec. II the magnitude of the iterative
decoder messages was growing linearly with the iteration
number. Such a linear growth was not observed for the
instantons $\xi_a$ and $\xi_b$. Should that be expected?